A Powerful and Versatile XML Format for Representing Role-semantic Annotation

Authors

  • Katrin Erk
  • Sebastian Padó
Abstract

We present two XML formats for the description and encoding of semantic role information in corpora. The TIGER/SALSA XML format provides a modular representation for semantic roles and syntactic structure. The Text-SALSA XML format is a lightweight version of TIGER/SALSA XML designed for manual annotation with an XML editor rather than a special tool. Both formats can deal with underspecification, roles crossing the sentence boundary, compound splitting, and whole-sentence tags for meta-level comments.
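The modular idea in the abstract, with semantic roles kept apart from the syntactic structure, can be illustrated with a minimal sketch. It assumes that the semantic layer refers to syntactic nodes by identifier. The element names, attribute names, and sample sentence below are simplified, hypothetical choices for illustration only; they are not the schema defined in the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified sentence. Element and attribute names are
# illustrative assumptions, NOT the published TIGER/SALSA XML schema.
SAMPLE = """
<s id="s1">
  <graph>
    <t id="s1_1" word="Peter"/>
    <t id="s1_2" word="buys"/>
    <t id="s1_3" word="books"/>
  </graph>
  <sem>
    <frame name="Commerce_buy" id="s1_f1">
      <target idref="s1_2"/>
      <fe name="Buyer" idref="s1_1"/>
      <fe name="Goods" idref="s1_3"/>
    </frame>
  </sem>
</s>
"""


def read_roles(xml_text):
    """Return (frame name, target word, {role: word}) for each frame."""
    root = ET.fromstring(xml_text.strip())
    # Syntactic layer: map each terminal id to its word form.
    words = {t.get("id"): t.get("word") for t in root.iter("t")}
    result = []
    # Semantic layer: resolve each id reference against the syntactic layer.
    for frame in root.iter("frame"):
        target = words[frame.find("target").get("idref")]
        roles = {fe.get("name"): words[fe.get("idref")]
                 for fe in frame.iter("fe")}
        result.append((frame.get("name"), target, roles))
    return result


FRAMES = read_roles(SAMPLE)
```

Because the role layer only stores references in this sketch, the syntactic layer could be replaced or extended without touching the role annotation, which is one plausible reading of "modular" here.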

Similar resources

XARA: An XML- and Rule-based Semantic Role Labeler

XARA is a rule-based PropBank labeler for Alpino XML files, written in Java. I used XARA in my research on semantic role labeling in a Dutch corpus to bootstrap a dependency treebank with semantic roles. Rules in XARA are based on XPath expressions, which makes it a versatile tool that is applicable to other treebanks as well. In addition to automatic role annotation, XARA is able to extract tr...

Representing NCBO Annotator Results in Standard RDF with the Annotation Ontology

Semantic annotation is part of the Semantic Web vision. The Annotation Ontology is a model that has been proposed to represent any annotations in standard RDF. The NCBO Annotator Web service is a broadly used service for annotations in the biomedical domain, offered within the BioPortal platform and giving access to more than 350 ontologies. This paper presents a new output format to represen...

Representing and Accessing Multi-Level Annotations in MMAX2

MMAX2 is a versatile, XML-based annotation tool which has already been used in a variety of annotation projects. It is also the tool of choice in the ongoing project DIANA-Summ, which deals with anaphora resolution and its application to spoken dialog summarization. The project uses the ICSI Meeting Corpus (Janin et al., 2003), a corpus of multi-party dialogs which contains a considerable amou...

First steps towards an ISO standard for annotating discourse relations

This paper describes initial studies in the context of a new effort within ISO to design an international standard for the annotation of discourse with semantic relations that are important for its coherence, “discourse relations”. This effort takes the Penn Discourse Treebank (PDTB) as its starting point, and applies a methodology for defining semantic annotation languages which distinguishes ...

An Ontology-Based Multimedia Annotator for the Semantic Web of Language Engineering

The development of the Semantic Web, the next-generation Web, greatly relies on the availability of ontologies and powerful annotation tools. However, there is a lack of ontology-based annotation tools for linguistic multimedia data. Existing tools either lack ontology support or provide limited support for multimedia. To fill the gap, we present an ontology-based linguistic multimedia annotati...

Publication date: 2004